METHODS, SYSTEMS AND COMPUTER PROGRAM PRODUCTS FOR
SECURITY PROCESSING INBOUND COMMUNICATIONS IN A CLUSTER
COMPUTING ENVIRONMENT



Related Applications 

The present application is related to commonly-assigned
and concurrently filed United States Patent
Application Serial No. , entitled "METHODS, SYSTEMS
AND COMPUTER PROGRAM PRODUCTS FOR SECURITY PROCESSING
OUTBOUND COMMUNICATIONS IN A CLUSTER COMPUTING
ENVIRONMENT," Attorney Docket No. 5577-219, the disclosure
of which is incorporated by reference as if set forth
fully herein.

Field of the Invention 

The present invention relates to network 
communications and more particularly to network 
communications to a cluster of data processing systems. 

Background of the Invention 

The Internet Protocol (IP) is a connectionless
protocol. IP packets are routed from an originator
through a network of routers to the destination. All
physical adapter devices in such a network, including
those for client and server hosts, are identified by an
IP Address which is unique within the network. One
valuable feature of IP is that a failure of an
intermediate router node or adapter will not prevent a
packet from moving from source to destination, as long as
there is an alternate path through the network.

In Transmission Control Protocol/Internet Protocol
(TCP/IP), TCP sets up a connection between two endpoints,
identified by the respective IP addresses and a port
number on each. Unlike failures of an adapter in an
intermediate node, if one of the endpoint adapters (or
the link leading to it) fails, all connections through
that adapter fail, and must be reestablished. If the
failure is on a client workstation host, only the
relatively few client connections are disrupted, and
usually only one person is inconvenienced. However, an
adapter failure on a server means that hundreds or
thousands of connections may be disrupted. On a
System/390 with large capacity, the number may run to
tens of thousands.

To alleviate this situation, International Business
Machines Corporation introduced the concept of a virtual
IP Address, or VIPA, on its TCP/IP for OS/390 V2R5 (and
added to V2R4 as well). Examples of VIPAs and their use
may be found in United States Patent Nos. 5,917,997,
5,923,854, 5,935,215 and 5,951,650. A VIPA is configured
the same as a normal IP address for a physical adapter,
except that it is not associated with any particular
device. To an attached router, the TCP/IP stack on
System/390 simply looks like another router. When the
TCP/IP stack receives a packet destined for one of its
VIPAs, the inbound IP function of the TCP/IP stack notes
that the IP address of the packet is in the TCP/IP
stack's Home list of IP addresses and forwards the packet
up the TCP/IP stack. The "home list" of a TCP/IP stack
is the list of IP addresses which are "owned" by the
TCP/IP stack. Assuming the TCP/IP stack has multiple
adapters or paths to it (including a Cross Coupling
Facility (XCF) path from other TCP/IP stacks in a
Sysplex), if a particular physical adapter fails, the
attached routing network will route VIPA-targeted packets
to the TCP/IP stack via an alternate route. The VIPA may,
thus, be thought of as an address to the stack, and not
to any particular adapter.

While the use of VIPAs may remove hardware and
associated transmission media as a single point of
failure for large numbers of connections, the
connectivity of a server can still be lost through a
failure of a single stack or an MVS image. The VIPA
Configuration manual for System/390 tells the customer
how to configure the VIPA(s) for a failed stack on
another stack, but this is a manual process. Substantial
down time of a failed MVS image or TCP/IP stack may still
result until operator intervention to manually
reconfigure the TCP/IP stacks in a Sysplex to route
around the failed TCP/IP stack or MVS image.

While merely restarting an application with a new IP
address may resolve many failures, applications use IP
addresses in different ways and, therefore, such a
solution may be inappropriate. The first time a client
resolves a name in its local domain, the local Dynamic
Name Server (DNS) will query back through the DNS
hierarchy to get to the authoritative server. For a
Sysplex, the authoritative server should be DNS/Workload
Manager (WLM). DNS/WLM will consider relative workloads
among the nodes supporting the requested application, and
will return the IP address for the most appropriate
available server. IP addresses for servers that are not
available will not be returned. The Time to Live of the
returned IP address will be zero, so that the next
resolution query (on failure of the original server, for
example) will go all the way back to the DNS/WLM that has
the knowledge to return the IP address of an available
server.
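The selection behavior described above can be illustrated with a minimal
sketch. The names ServerStatus, load and resolve are hypothetical and do
not come from DNS/WLM itself; the sketch only assumes that each server
reports availability and a single numeric workload figure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ServerStatus:
    ip: str
    available: bool
    load: float  # hypothetical workload metric; lower means less busy


def resolve(servers: list[ServerStatus]) -> Optional[tuple[str, int]]:
    """Return (ip, ttl) for the least-loaded available server, or None."""
    candidates = [s for s in servers if s.available]
    if not candidates:
        return None
    best = min(candidates, key=lambda s: s.load)
    # A Time to Live of zero means the next query must be resolved again.
    return best.ip, 0


servers = [
    ServerStatus("10.0.0.1", available=True, load=0.7),
    ServerStatus("10.0.0.2", available=False, load=0.1),
    ServerStatus("10.0.0.3", available=True, load=0.3),
]
answer = resolve(servers)
```

Here the unavailable server is never returned even though it reports the
lowest load, which mirrors the statement that addresses of unavailable
servers are not returned.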

However, in practice, things do not always work as
described above. For example, some clients are
configured to a specific IP address, thus requiring human
intervention to go to another server. However, the person
using the client may not have the knowledge to
reconfigure the client for a new IP address.
Additionally, some clients ignore the Time to Live, and
cache the IP address as long as the client is active.
Human intervention may again be required to recycle the
client to obtain a new IP address. Also, DNSs are often
deployed as a hierarchy to reduce network traffic, and
DNSs may cache the IP address beyond the stated Time to
Live even when the client behaves quite correctly. Thus,
even if the client requests a new IP address, the client
may receive the cached address from the DNS. Finally,
some users may prefer to configure DNS/WLM to send a Time
to Live that is greater than zero, in an attempt to limit
network-wide traffic to resolve names. Problems arising
from these various scenarios may be reduced if the IP
address with which the client communicates does not
change. However, as described above, to effect such a
movement of VIPAs between TCP/IP stacks requires operator
intervention and may result in lengthy down times for the
applications associated with the VIPA.

Previous approaches to increased availability
focused on providing spare hardware. The High-Availability
Coupled Multi-Processor (HACMP) design
allows for taking over the MAC address of a failing
adapter on a shared medium (LAN). This works both for a
failing adapter (failover to a spare adapter on the same
node) and for a failing node (failover to another node via
spare adapter or adapters on the takeover node). Spare
adapters are not used for IP traffic, but they are used
to exchange heartbeats among cluster nodes for failure
detection. All of the work on a failing node goes to a
single surviving node. In addition to spare adapters and
access to the same application data, the designated
failover node must also have sufficient spare processing
capacity to handle the entire failing node workload with
"acceptable" service characteristics (response and
throughput).

Automatic restart of failing applications also
provides faster recovery of a failing application or
node. This may be acceptable when the application can be
restarted in place, but is less useful when the
application is moved to another node, unless the IP
address known to the clients can be moved with the
application, or dynamic DNS updates with alternate IP
addresses can be propagated to a DNS local to clients
sufficiently quickly.

Other attempts at error recovery have included the
EDDIE system described in a paper titled "EDDIE, A Robust
and Scalable Internet Server" by A. Dahlin, M. Froberg,
J. Grebeno, J. Walerud, and P. Winroth, of Ericsson
Telecom AB, Stockholm, Sweden, May 1998. In the EDDIE
approach, a distributed application called "IP Address
Migration Application" controls all IP addresses in the
cluster. The cluster is connected via a shared-medium
LAN. IP address aliasing is used to provide addresses to
individual applications over a single adapter, and these
aliases are located via the Address Resolution Protocol
(ARP) and ARP caches in the TCP/IPs. The application
monitors all server applications and hardware, and
reallocates aliased IP addresses, in the event of
failure, to surviving adapters and nodes. This approach
allows applications of a failing node to be distributed
among surviving nodes, but it may require the monitoring
application to have complete knowledge of the application
and network adapter topology in the cluster. In this
sense, it is similar to existing Systems Management
applications such as those provided by International
Business Machines Corporation's Tivoli® network
management software, but the IP Address Migration
Application has direct access to adapters and ARP caches.
The application also requires a dedicated IP address for
inter-application communication and coordination.

United States Patent Application Serial No.
09/401,419 entitled "METHODS, SYSTEMS AND COMPUTER
PROGRAM PRODUCTS FOR AUTOMATED MOVEMENT OF IP ADDRESSES
WITHIN A CLUSTER" filed September 22, 1999, the
disclosure of which is incorporated herein by reference
as if set forth fully herein, describes dynamic virtual
IP addresses (VIPA) and their use. As described in the
'419 application, a dynamic VIPA may be automatically
moved from protocol stack to protocol stack in a
predefined manner to overcome failures of a particular
protocol stack (i.e. VIPA takeover). Such a predefined
movement may provide a predefined backup protocol stack
for a particular VIPA. VIPA takeover was made available
by International Business Machines Corporation (IBM),
Armonk, NY, in System/390 V2R8 which had a general
availability date of September, 1999.

In addition to failure scenarios, scalability and
load balancing are also issues which have received
considerable attention in light of the expansion of the
Internet. For example, it may be desirable to have
multiple servers servicing customers. The workload of
such servers may be balanced by providing a single
network visible IP address which is mapped to multiple
servers.

Such a mapping process may be achieved by, for
example, network address translation (NAT) facilities,
dispatcher systems and IBM's Dynamic Name Server/Workload
Management (DNS/WLM) systems. These various mechanisms for
allowing multiple servers to share a single IP address
are illustrated in Figures 1 through 3.

Figure 1 illustrates a conventional network address
translation system as described above. In the system of
Figure 1, a client 10 communicates over a network 12 to a
network address translation system 14. The network
address translation system receives the communications
from the client 10 and converts the communications from
the addressing scheme of the network 12 to the addressing
scheme of the network 12' and sends the messages to the
servers 16. A server 16 may be selected from multiple
servers 16 at connect time and may be on any host, one or
more hops away. All inbound and outbound traffic flows
through the NAT system 14.

Figure 2 illustrates a conventional DNS/WLM system
as described above. As mentioned above, the server 16 is
selected at name resolution time when the client 10
resolves the name for the destination server from DNS/WLM
system 17 which is connected to the servers 16 through
the coupling facility 19 and to the network 12. As
described above, the DNS/WLM system of Figure 2 relies on
the client 10 adhering to the zero time to live.

Figure 3 illustrates a conventional dispatcher
system. As seen in Figure 3, the client 10 communicates
over the network 12 with a dispatcher system 18 to
establish a connection. The dispatcher routes inbound
packets to the servers 16 and outbound packets are sent
over network 12' but may flow over any available path to
the client 10. The servers 16 are typically on a
directly connected network to the dispatcher 18 and a
server 16 is selected at connect time.

Such a dispatcher system is illustrated by the
Interactive Network Dispatcher function of the IBM 2216
and AIX platforms. In these systems, the same IP address
that the Network Dispatcher node 18 advertises to the
routing network 12 is activated on server nodes 16 as a
loopback address. The node performing the distribution
function connects to the endpoint stack via a single hop
connection because normal routing protocols typically
cannot be used to get a connection request from the
endpoint to the distributing node if the endpoint uses
the same IP address as the distributing node advertises.
Network Dispatcher uses an application on the server to
query a workload management function (such as WLM of
System/390), and collects this information at intervals,
e.g., 30 seconds or so. Applications running on the
Network Dispatcher node can also issue "null" queries to
selected application server instances as a means of
determining server instance health.

In addition to the above described systems, Cisco
Systems offers a Multi-Node Load Balancing function on
certain of its routers that perform the distribution
function. Such operations appear similar to those of the
IBM 2216.

In addition to the system described above,
AceDirector from Alteon provides a virtual IP address and
performs network address translation to a real address of
a selected server application. AceDirector appears to
observe connection request turnaround times and rejection
as a mechanism for determining server load capabilities.

A still further consideration which has arisen as a
result of increased use of the Internet is security.
Recently, the Internet has seen an increase in use of
Virtual Private Networks which utilize the Internet as a
communications medium but impose security protocols onto
the Internet to provide secure communications between
network hosts. Typically, these security protocols are
intended to provide "end-to-end" security in that secure
communications are provided for the entire communications
path between two host processing systems. However,
Internet security protocols, which are typically intended
to provide "end-to-end" security between a source IP
address and a destination IP address, may present
difficulties for load balancing and failure recovery.

As an example, the Internet Protocol Security
Architecture (IPSec) is a Virtual Private Network (VPN)
technology that operates on the network layer (layer 3)
in conjunction with an Internet Key Exchange (IKE)
protocol component that operates at the application layer
(layer 5 or higher). IPSec uses symmetric keys to secure
traffic between peers. These symmetric keys are
generated and distributed by the IKE function. IPSec
uses security associations (SAs) to provide security
services to traffic. SAs are unidirectional logical
connections between two IPSec systems which may be
uniquely identified by the triplet of <Security Parameter
Index, IP Destination Address, Security Protocol>. To
provide bidirectional communications, two SAs are
defined, one in each direction.
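As an illustration only, the identifying triplet can be modeled as a
lookup key. The type name SAKey, the addresses and the SPI values below
are made up for the example and are not taken from any IPSec
implementation.

```python
from typing import NamedTuple


class SAKey(NamedTuple):
    spi: int        # Security Parameter Index
    dst_addr: str   # IP destination address
    protocol: str   # security protocol, "AH" or "ESP"


# Bidirectional traffic between hosts A and B needs two unidirectional
# SAs, one keyed on each destination.
sa_a_to_b = SAKey(spi=0x1001, dst_addr="192.0.2.2", protocol="ESP")
sa_b_to_a = SAKey(spi=0x2002, dst_addr="192.0.2.1", protocol="ESP")

# A toy table keyed on the triplet; the values are placeholders.
sa_table = {
    sa_a_to_b: {"direction": "A->B"},
    sa_b_to_a: {"direction": "B->A"},
}

found = sa_table[SAKey(0x1001, "192.0.2.2", "ESP")]
```

Because the key is an immutable tuple, a freshly built triplet with the
same three values finds the same SA entry.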

SAs are managed by IPSec systems maintaining two
databases: a Security Policy Database (SPD) and a
Security Associations Database (SAD). The SPD specifies
what security services are to be offered to the IP
traffic. Typically, the SPD contains an ordered list of
policy entries which are separate for inbound and
outbound traffic. These policies may specify, for
example, that some traffic must not go through IPSec
processing, some traffic must be discarded and some
traffic must be IPSec processed.

The SAD contains parameter information about each
SA. Such parameters may include the security protocol
algorithms and keys for Authentication Header (AH) or
Encapsulating Security Payload (ESP) security protocols,
sequence numbers, protocol mode and SA lifetime. For
outbound processing, an SPD entry points to an entry in
the SAD. In other words, the SPD determines which SA is
to be used for a given packet. For inbound processing,
the SAD is consulted to determine how the packet is
processed.

As described above, IPSec provides for two types of
security protocols, Authentication Header (AH) and
Encapsulating Security Payload (ESP). AH provides origin
authentication for an IP datagram by incorporating an AH
header which includes authentication information. ESP
encrypts the payload of an IP packet using shared secret
keys. A single SA may be either AH or ESP but not both.
However, multiple SAs may be provided with differing
protocols. For example, two SAs could be established to
provide both AH and ESP protocols for communications
between two hosts.

IPSec also supports two modes of SAs: transport mode
and tunnel mode. In transport mode, an IPSec header is
inserted after the IP header of the IP datagram. In the
case of ESP, a trailer and optional ESP authentication
data are appended to the end of the original payload. In
tunnel mode, a new IP datagram is constructed and the
original IP datagram is made the payload of the new IP
datagram. IPSec in transport mode is then applied to the
new IP datagram. Tunnel mode is typically used when
either end of a SA is a gateway.
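The difference between the two modes can be sketched by treating a
datagram as an ordered list of header and payload labels. This is a
layout illustration only, assuming the IPSec header is placed
immediately after the IP header as in standard IPSec; the function names
and labels are hypothetical and no cryptography is performed.

```python
def transport_mode(datagram: list[str], ipsec_header: str) -> list[str]:
    """Place the IPSec header directly after the leading IP header."""
    ip_header, *rest = datagram
    out = [ip_header, ipsec_header, *rest]
    if ipsec_header == "ESP":
        # ESP also appends a trailer and optional authentication data.
        out += ["ESP trailer", "ESP auth (optional)"]
    return out


def tunnel_mode(datagram: list[str], ipsec_header: str) -> list[str]:
    """Make the whole original datagram the payload of a new datagram."""
    new_datagram = ["new IP header", *datagram]
    # Transport-mode processing is then applied to the new datagram.
    return transport_mode(new_datagram, ipsec_header)


original = ["IP header", "TCP header", "data"]
ah_transport = transport_mode(original, "AH")
esp_tunnel = tunnel_mode(original, "ESP")
```

In the tunnel-mode result the original IP header ends up behind the
IPSec header, which is why a receiver must know the mode to locate the
inner headers.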

SAs are negotiated between the two endpoints of the
SA and may, typically, be established through prior
negotiations or dynamically. IKE may be utilized to
negotiate a SA utilizing a two phase negotiation. In
phase 1, an Internet Security Association and Key
Management Protocol (ISAKMP) security association is
established. It is assumed that a secure channel does
not exist and, therefore, one is established to protect
the ISAKMP messages. This security association is owned
by ISAKMP. During phase 1, the partners exchange
proposals for the ISAKMP security association and agree
on one. The partners then exchange information for
generating a shared master secret. Both parties then
generate keying material and shared secrets before
exchanging additional authentication information.

In phase 2, subsequent security associations for
other services are negotiated. The ISAKMP security
association is used to negotiate the subsequent SAs. In
phase 2, the partners exchange proposals for protocol SAs
and agree on one. To generate keys, both parties use the
keying material from phase 1 and may, optionally, perform
additional exchanges. Multiple phase 2 exchanges may be
provided under the same phase 1 protection.

Once phase 1 and phase 2 exchanges have successfully
completed, the peers have reached a state where they can
start to protect traffic with IPSec according to
applicable policies and traffic profiles. The peers
would then have agreed on a proposal to authenticate each
other and to protect future IKE exchanges, exchanged
enough secret and random information to create keying
material for later key generation, mutually authenticated
the exchange, agreed on a proposal to authenticate and
protect data traffic with IPSec, exchanged further
information to generate keys for IPSec protocols,
confirmed the exchange and generated all necessary keys.

With IPSec in place, for host systems sending
outbound packets, the SPD is consulted to determine if
IPSec processing is required or if other processing or
discarding of the packet is to be performed. If IPSec is
required, the SAD is searched for an existing SA for
which the packet matches the profile. If no SA is found,
a new IKE negotiation is started that results in the
desired SA being established. If an SA is found or after
negotiation of an SA, IPSec is applied to the packet as
defined by the SA and the packet is delivered.
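The outbound steps just described can be sketched as follows. This is a
simplified model under stated assumptions: the SPD is an ordered list of
(selector, action) pairs where the first match wins, a packet matching
no entry is discarded, the traffic profile is just the (source,
destination) pair, and negotiate is a stand-in for an IKE negotiation.
None of these names come from a real IPSec stack.

```python
from enum import Enum


class Action(Enum):
    BYPASS = "bypass"    # send without IPSec processing
    DISCARD = "discard"
    APPLY = "apply"      # IPSec processing required


def outbound(packet: dict, spd: list, sad: dict, negotiate) -> str:
    """Return a short description of what happened to the packet."""
    # Consult the SPD first; the default of DISCARD is an assumption.
    action = next((act for sel, act in spd if sel(packet)), Action.DISCARD)
    if action is Action.DISCARD:
        return "discarded"
    if action is Action.BYPASS:
        return "sent in the clear"
    # IPSec is required: search the SAD, negotiating only on a miss.
    profile = (packet["src"], packet["dst"])
    if profile not in sad:
        sad[profile] = negotiate(profile)
    return f"sent with SA {sad[profile]}"


spd = [
    (lambda p: p["dst"] == "192.0.2.9", Action.APPLY),
    (lambda p: p["dst"].startswith("10."), Action.BYPASS),
]
sad = {}
negotiations = []


def fake_ike(profile):
    negotiations.append(profile)
    return f"SA-{len(negotiations)}"


r1 = outbound({"src": "192.0.2.1", "dst": "192.0.2.9"}, spd, sad, fake_ike)
r2 = outbound({"src": "192.0.2.1", "dst": "192.0.2.9"}, spd, sad, fake_ike)
r3 = outbound({"src": "192.0.2.1", "dst": "10.1.1.1"}, spd, sad, fake_ike)
r4 = outbound({"src": "192.0.2.1", "dst": "203.0.113.5"}, spd, sad, fake_ike)
```

The second packet to the same destination reuses the SA created for the
first, so only one negotiation takes place.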

For packets inbound to a host system, if IPSec is
required, the SAD is searched for an existing security
parameter index to match the security parameter index of
the inbound packet. If no match is found the packet is
discarded. If a match is found, IPSec is applied to the
packet as required by the SA and the SPD is consulted to
determine if IPSec or other processing is required.
Finally, the payload is delivered to the local process.
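A matching sketch of the inbound path is given below. It assumes the
SAD can be indexed by security parameter index alone, that a packet
failing the policy check is discarded, and it uses string reversal as a
placeholder for real IPSec processing. The names inbound, unprotect and
policy_allows are hypothetical.

```python
from typing import Callable


def inbound(packet: dict, sad: dict,
            policy_allows: Callable[[dict, dict], bool]) -> str:
    """Return a short description of what happened to the packet."""
    sa = sad.get(packet["spi"])
    if sa is None:
        return "discarded: no matching SPI"
    # Apply IPSec as required by the SA (placeholder transformation).
    payload = sa["unprotect"](packet["payload"])
    # Consult policy after IPSec processing; discarding on failure is
    # an assumption of this sketch.
    if not policy_allows(packet, sa):
        return "discarded: policy"
    return f"delivered: {payload}"


sad = {0x1001: {"unprotect": lambda data: data[::-1]}}
ok = inbound({"spi": 0x1001, "payload": "olleh"}, sad, lambda p, s: True)
unknown = inbound({"spi": 0x9999, "payload": "x"}, sad, lambda p, s: True)
denied = inbound({"spi": 0x1001, "payload": "olleh"}, sad, lambda p, s: False)
```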

In light of the above discussion, various of the
workload distribution methods described above may have
compatibility problems with IPSec.

Summary of the Invention 

Embodiments of the present invention provide
methods, systems and computer program products for
providing Internet Protocol Security (IPSec) to a
plurality of target hosts in a cluster of data processing
systems which communicate with a network through a
routing communication protocol stack utilizing a
dynamically routable Virtual Internet Protocol Address
(DVIPA) by negotiating security associations (SAs)
associated with the DVIPA utilizing an Internet Key
Exchange (IKE) component associated with the routing
communication protocol stack and distributing information
about the negotiated SAs to the target hosts so as to
allow the target hosts to perform IPSec processing of
communications from the network utilizing the negotiated
SAs.

In particular embodiments of the present invention,
the routing communication protocol stack receives a
communication from the network, determines if the
communication is an IPSec communication to the DVIPA and
routes the received communication to one of the target
hosts associated with the DVIPA. The determination of
whether the communication is an IPSec communication may
be accomplished by evaluating a destination address in
the TCP header of a received datagram of the
communication and determining if the destination address
is a DVIPA. Additionally, the evaluation of a
destination address may be preceded by determining if the
destination address is encrypted and decrypting the
received communication utilizing an SA associated with
the IPSec communication to an end of a Transmission
Control Protocol (TCP) header of the datagram. The
determination of the end of the TCP header may be based
on whether the IPSec SA is in transport mode or tunnel
mode.

In still further embodiments of the present
invention, the routing communication protocol stack
bypasses inbound filtering if the communication is an
IPSec communication to a DVIPA. Alternatively, the
routing communication protocol stack inbound filters the
communication if the communication is an IPSec
communication, encapsulates the filtered inbound
communication in a generic routing format and routes the
encapsulated communication to the one of the target
hosts. The communication protocol stack of the one of
the target hosts bypasses inbound filtering of the routed
encapsulated communication and decapsulates the
encapsulated communication.

In additional embodiments of the present invention,
the routing communication protocol stack performs a
tunnel check of the received communication and rejects
the communication so as to not route the received
communication to the one of the target hosts based on the
results of the tunnel check. Also, the routing
communication protocol stack may perform a replay
sequence number check on the received communication and
reject the communication so as to not route the received
communication to the one of the target hosts based on the
results of the replay sequence number check.
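The text does not say how the replay sequence number check is
performed. One common approach, shown here purely as an assumption, is a
sliding-window bitmap of recently seen sequence numbers. The class name
ReplayWindow and the window size are illustrative.

```python
class ReplayWindow:
    """Sliding-window duplicate detector for packet sequence numbers."""

    def __init__(self, size: int = 32) -> None:
        self.size = size
        self.highest = 0   # highest sequence number accepted so far
        self.bitmap = 0    # bit i set means (highest - i) was seen

    def check_and_update(self, seq: int) -> bool:
        """Record seq and return True if fresh; return False to reject."""
        if seq <= 0:
            return False   # this sketch numbers packets from 1
        if seq > self.highest:
            # Slide the window forward and mark the new highest as seen.
            shift = seq - self.highest
            mask = (1 << self.size) - 1
            self.bitmap = ((self.bitmap << shift) | 1) & mask
            self.highest = seq
            return True
        offset = self.highest - seq
        if offset >= self.size:
            return False   # older than the window covers
        if self.bitmap & (1 << offset):
            return False   # already seen: a replay
        self.bitmap |= 1 << offset
        return True


window = ReplayWindow(size=4)
results = [window.check_and_update(s) for s in (1, 1, 3, 2, 2)]
```

With this approach a routing stack could reject a replayed datagram
before forwarding it, so the target host never spends cycles on it.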

In additional embodiments of the present invention,
the communications are routed by selecting a target host
from the plurality of target hosts based on entries in a
distributed connection table associated with the DVIPA
and sending the received communication to the selected
target host over a cross-coupling facility (XCF) link.
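A minimal sketch of table-driven routing follows. It assumes the
connection table is keyed on the connection 4-tuple and that new
connections are assigned round-robin; the assignment policy and the
names Distributor and route are assumptions for illustration, and the
XCF send itself is not modeled.

```python
from itertools import cycle

# (source ip, source port, destination ip, destination port)
ConnKey = tuple[str, int, str, int]


class Distributor:
    def __init__(self, targets: list[str]) -> None:
        self._next_target = cycle(targets)  # round-robin is an assumption
        self.connection_table: dict[ConnKey, str] = {}

    def route(self, key: ConnKey) -> str:
        """Return the target host for a datagram, adding an entry if new."""
        if key not in self.connection_table:
            self.connection_table[key] = next(self._next_target)
        return self.connection_table[key]


dist = Distributor(["hostA", "hostB"])
conn1 = ("198.51.100.7", 40000, "192.0.2.10", 80)
conn2 = ("198.51.100.7", 40001, "192.0.2.10", 80)
first = dist.route(conn1)
second = dist.route(conn2)
again = dist.route(conn1)
```

Every datagram of an existing connection maps to the same target host,
which is what allows that host to hold the connection state.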

In still further embodiments of the present
invention, the distributed SAs are stored in a shadow
cache of communication protocol stacks of the target
hosts. IPSec processing of communications to the DVIPA
utilizes the SAs in the shadow cache. Additionally,
logging of information may occur at the communication
host performing the IPSec processing of the received
communication.

In particular embodiments of the present invention,
an inbound lifesize count may be provided to the routing
communication protocol stack. In such embodiments, the
IKE refreshes the SAs associated with the DVIPA based on
the inbound lifesize count provided by the communication
protocol stacks of the target hosts. The inbound lifesize
count may be provided by sending cross coupling facility
(XCF) messages containing the inbound lifesize count to
the routing communication protocol stack. The XCF
messages may be periodically sent to the routing
communication protocol stack such that one message
represents the lifesize count for a plurality of IPSec
processed communications. The plurality of IPSec
processed communications may be a percentage of a total
lifesize count associated with an SA. The percentage of
the total lifesize may be dynamically based on whether
the IKE has previously refreshed the SA prior to
expiration of the lifesize count associated with the SA.
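The batching idea can be sketched as below, assuming the lifesize is
counted in bytes and that a report is sent whenever the unreported count
reaches a fixed percentage of the SA's total lifesize. The class name
LifesizeReporter is hypothetical, send stands in for an XCF message, and
the dynamic adjustment of the percentage is not modeled.

```python
class LifesizeReporter:
    """Accumulates processed bytes and reports them in batches."""

    def __init__(self, total_lifesize: int, percent: float, send) -> None:
        self.threshold = total_lifesize * percent / 100.0
        self.pending = 0   # bytes processed but not yet reported
        self.send = send   # stand-in for sending an XCF message

    def on_datagram(self, nbytes: int) -> None:
        self.pending += nbytes
        if self.pending >= self.threshold:
            # One message covers many IPSec processed datagrams.
            self.send(self.pending)
            self.pending = 0


reports = []
reporter = LifesizeReporter(total_lifesize=10_000, percent=10,
                            send=reports.append)
for _ in range(25):
    reporter.on_datagram(100)
```

In this run 25 datagrams produce only two messages, each covering ten
datagrams, with the remaining count held until the next threshold.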


In still further embodiments of the present invention,
Internet Protocol Security (IPSec) is provided to a
plurality of target hosts in a cluster of data processing
systems by a shadow SA cache at each of the target hosts
which is configured to store security association (SA)
information associated with a dynamically routable
Virtual Internet Protocol Address (DVIPA) and a
communication protocol stack at each of the target hosts
configured to IPSec process datagrams associated with the
DVIPA utilizing the SA information in the shadow SA cache.

In further embodiments, a routing communication
protocol stack is configured to provide communications
for the plurality of target hosts with a network
utilizing the dynamically routable Virtual Internet Protocol
Address (DVIPA). An Internet Key Exchange module is
associated with the routing communication protocol stack.
The routing communication protocol stack is further
configured to distribute security association (SA)
information for IPSec SAs negotiated by the IKE and
associated with the DVIPA to the communication protocol
stack at each of the target hosts. The communication
protocol stack at each of the target hosts is configured
to store the IPSec SA information in the shadow SA cache.
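The distribution of negotiated SAs to shadow caches can be pictured
with the following sketch. The classes TargetStack and RoutingStack, the
method names, and the choice to key the cache by SPI are all assumptions
made for illustration; direct method calls stand in for whatever
cluster messaging is actually used.

```python
class TargetStack:
    def __init__(self, name: str) -> None:
        self.name = name
        self.shadow_sa_cache: dict[int, dict] = {}

    def receive_sa(self, spi: int, sa_info: dict) -> None:
        # Keep a private copy so later changes elsewhere do not leak in.
        self.shadow_sa_cache[spi] = dict(sa_info)


class RoutingStack:
    def __init__(self, targets: list[TargetStack]) -> None:
        self.targets = targets

    def on_sa_negotiated(self, dvipa: str, spi: int, sa_info: dict) -> None:
        """Fan out an SA negotiated by IKE for a DVIPA to every target."""
        for target in self.targets:
            target.receive_sa(spi, {"dvipa": dvipa, **sa_info})


targets = [TargetStack("hostA"), TargetStack("hostB")]
routing = RoutingStack(targets)
routing.on_sa_negotiated("192.0.2.10", 0x1001, {"protocol": "ESP"})
```

After the call each target holds its own copy of the SA information and
can process datagrams for the DVIPA without contacting the routing
stack.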

The routing communication protocol stack may be
further configured to decrypt the Transmission Control
Protocol (TCP) header of received IPSec encapsulated
datagrams to determine if the received datagram is
associated with a DVIPA.

The communication protocol stack at each of the
target hosts may be configured to update a lifesize count
of the IKE associated with IPSec processed datagrams.

As will further be appreciated by those of skill in
the art, the present invention may be embodied as
methods, apparatus/systems and/or computer program
products.

Brief Description of the Drawings

Figure 1 is a block diagram of a conventional network
address translation system;

Figure 2 is a block diagram of a conventional DNS/WLM
system;

Figure 3 is a block diagram of a conventional
dispatcher system;

Figure 4 is a block diagram of a cluster of data
processing systems including Sysplex Distributor which
incorporate embodiments of the present invention;

Figure 5 is a flowchart illustrating operations
according to embodiments of the present invention;

Figure 6 is a flowchart illustrating operations of a
routing communication protocol stack for distribution of
IPSec SAs according to embodiments of the present
invention;

Figure 7 is a flowchart illustrating operations of a
target host communication protocol stack for processing
shadow SA information according to embodiments of the
present invention;

Figure 8 is a flowchart illustrating operations of a
routing communication protocol stack for processing
inbound communications according to embodiments of the
present invention;

Figure 9 is a flowchart illustrating operations of a
target host communication protocol stack for processing
inbound communications according to embodiments of the
present invention;

Figure 10 is a flowchart illustrating operations of
a target host communication protocol stack for processing
outbound communications according to embodiments of the
present invention; and

Figure 11 is a flowchart illustrating operations for
establishing a target host originated connection
according to embodiments of the present invention.



Detailed Description of the Invention

The present invention now will be described more
fully hereinafter with reference to the accompanying
drawings, in which preferred embodiments of the invention
are shown. This invention may, however, be embodied in
many different forms and should not be construed as
limited to the embodiments set forth herein; rather,
these embodiments are provided so that this disclosure
will be thorough and complete, and will fully convey the
scope of the invention to those skilled in the art. Like
numbers refer to like elements throughout.

As will be appreciated by those of skill in the art,
the present invention can take the form of an entirely
hardware embodiment, an entirely software (including
firmware, resident software, micro-code, etc.)
embodiment, or an embodiment containing both software and
hardware aspects. Furthermore, the present invention can
take the form of a computer program product on a
computer-usable or computer-readable storage medium
having computer-usable or computer-readable program code
means embodied in the medium for use by or in connection
with an instruction execution system. In the context of
this document, a computer-usable or computer-readable
medium can be any means that can contain, store,
communicate, propagate, or transport the program for use
by or in connection with the instruction execution
system, apparatus, or device.






RSW920000135US1 



The computer-usable or computer-readable medium can
be, for example, but is not limited to, an electronic,
magnetic, optical, electromagnetic, infrared, or
semiconductor system, apparatus, device, or propagation
medium. More specific examples (a nonexhaustive list) of
the computer-readable medium would include the following:
an electrical connection having one or more wires, a
removable computer diskette, a random access memory
(RAM), a read-only memory (ROM), an erasable programmable
read-only memory (EPROM or Flash memory), an optical
fiber, and a portable compact disc read-only memory
(CD-ROM). Note that the computer-usable or computer-readable
medium could even be paper or another suitable medium
upon which the program is printed, as the program can be
electronically captured, via, for instance, optical
scanning of the paper or other medium, then compiled,
interpreted, or otherwise processed in a suitable manner
if necessary, and then stored in a computer memory.

The present invention can be embodied as systems,
methods, or computer program products which allow for
end-to-end network security to be provided in a cluster
of data processing systems which utilize a common IP
address and have workload utilizing the common IP address
distributed to data processing systems in the cluster.
Such secure network communications may be provided by
distributing security associations to target hosts so as
to allow the target hosts to process secure distributed
communications. By distributing computationally intense
security operations, processing load may be better
distributed within a cluster of data processing systems.

Embodiments of the present invention will now be
described with reference to Figures 4 through 11 which
are flowchart and block diagram illustrations of
operations of protocol stacks incorporating embodiments
of the present invention. It will be understood that each
block of the flowchart illustrations and/or block
diagrams, and combinations of blocks in the flowchart
illustrations and/or block diagrams, can be implemented
by computer program instructions. These program
instructions may be provided to a processor to produce a
machine, such that the instructions which execute on the
processor create means for implementing the functions
specified in the flowchart and/or block diagram block or
blocks. The computer program instructions may be
executed by a processor to cause a series of operational
steps to be performed by the processor to produce a
computer implemented process such that the instructions
which execute on the processor provide steps for
implementing the functions specified in the flowchart
and/or block diagram block or blocks.

Accordingly, blocks of the flowchart illustrations
and/or block diagrams support combinations of means for
performing the specified functions, combinations of steps
for performing the specified functions and program
instruction means for performing the specified functions.
It will also be understood that each block of the
flowchart illustrations and/or block diagrams, and
combinations of blocks in the flowchart illustrations
and/or block diagrams, can be implemented by special
purpose hardware-based systems which perform the
specified functions or steps, or combinations of special
purpose hardware and computer instructions.

Particular exemplary embodiments of the present
invention are illustrated in Figure 4, which illustrates
a Sysplex cluster utilizing Sysplex Distributor. The
Sysplex Distributor was provided in OS/390 V2R10 (General
Availability of September, 1999) and is described in
detail in commonly assigned United States Patent
Application Serial No. 09/640,409, entitled "METHODS,
SYSTEMS AND COMPUTER PROGRAM PRODUCTS FOR CLUSTER
WORKLOAD DISTRIBUTION" (Attorney Docket No. 5577-205),
United States Patent Application Serial No. 09/640,412,
entitled "METHODS, SYSTEMS AND COMPUTER PROGRAM PRODUCTS
FOR NON-DISRUPTIVELY TRANSFERRING A VIRTUAL INTERNET
PROTOCOL ADDRESS BETWEEN COMMUNICATION PROTOCOL STACKS"
(Attorney Docket No. 5577-207) and United States Patent
Application Serial No. 09/640,438, entitled "METHODS,
SYSTEMS AND COMPUTER PROGRAM PRODUCTS FOR FAILURE
RECOVERY FOR ROUTED VIRTUAL INTERNET PROTOCOL ADDRESSES"
(Attorney Docket No. 5577-206), the disclosures of which
are incorporated herein by reference as if set forth
fully herein.

Sysplex Distributor systems have previously
attempted to provide IPSec support by having the IPSec
endpoint be a distributing host and a TCP endpoint be the
target hosts. However, such a difference may lead to
security complexities and inconsistencies within the
Sysplex. Such inconsistencies have led to OS/390 V2R10
requiring that all IPSec traffic not be distributed, thus
depriving IPSec traffic of the benefits of distribution
provided by the systems of the above referenced patent
applications.

In Sysplex Distributor, a single IP address is
associated with a plurality of communication protocol
stacks in a cluster of data processing systems by
providing a routing protocol stack which associates a
Virtual IP Address (VIPA) and port with other
communication protocol stacks in the cluster and routes
communications to the VIPA and port to the appropriate
communication protocol stack. VIPAs capable of being
shared by a number of communication protocol stacks are
referred to herein as "dynamic routable VIPAs" (DVIPA).
While the present invention is described with reference
to specific embodiments in a System/390 Sysplex, as will
be appreciated by those of skill in the art, the present
invention may be utilized in other systems where clusters
of computers utilize virtual addresses by associating an
application or application group rather than a particular
communications adapter with the addresses. Thus, the
present invention should not be construed as limited to
the particular exemplary embodiments described herein.

A cluster of data processing systems is illustrated
in Figure 4 as a cluster of nodes in a Sysplex 10. As
seen in Figure 4, several data processing systems 20, 24,
28, 32 and 36 are interconnected in the Sysplex 10. The
data processing systems 20, 24, 28, 32 and 36 illustrated
in Figure 4 may be operating system images, such as MVS
images, executing on one or more computer systems. While
the present invention will be described primarily with
respect to the MVS operating system executing in a
System/390 environment, the data processing systems 20,
24, 28, 32 and 36 may be mainframe computers, mid-range
computers, servers or other systems capable of supporting
dynamic routable Virtual IP Addresses as described
herein.

As is further illustrated in Figure 4, the data
processing systems 20, 24, 28, 32 and 36 have associated
with them communication protocol stacks 22, 26, 30, 34
and 38, which may be TCP/IP stacks. The communication
protocol stacks 26, 30, and 34 have been modified to
incorporate shadow SA caches 23' and 27' as described
herein for providing security processing of dynamically
routed VIPAs at the target host. Also illustrated in
Figure 4 is an SA cache 23 associated with the IKE of MVS
1, an SA cache 25 associated with the IKE of MVS 4 and an
SA cache 27 associated with the IKE of MVS 5. Such SA
caches may be used with data processing systems having an
IKE to negotiate SAs such that SAs owned by the IKE may
be stored in the SA cache. For example, SAs negotiated
by a local instance of IKE on MVS 1 may be stored in the
SA cache 23. If the SAs are associated with a
dynamically routed VIPA, the SAs may also be stored in the
shadow SA cache 23' at potential target hosts for the
DVIPA. For example, in Figure 4, MVS 2 and MVS 4 are
associated with the DVIPA of routing communication
protocol stack 22 and, therefore, include the shadow SA
cache 23'. Similarly, MVS 2 and MVS 3 are associated
with the DVIPA of routing communication protocol stack 38
and, therefore, include the shadow SA cache 27'.

The shadow SA caches 23' and 27' may be organized by
DVIPA and may contain SA information for connections for
which the target host is an endpoint. For inbound IPSec
traffic, the shadow SA caches 23' and 27' may be accessed
by Security Parameter Index (SPI), IPSec protocol and
source address. For outbound traffic, the shadow SA
caches 23' and 27' may be accessed by Tunnel Name, which
may be determined by conventional filter rule lookup on
the target host.
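
The two access paths described above may be sketched as
follows. This is a minimal illustration only: the class
name ShadowSACache, the SecurityAssociation fields and the
method names are hypothetical and are not taken from any
Sysplex interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SecurityAssociation:
    """Hypothetical SA record; field names are illustrative only."""
    dvipa: str
    spi: int
    protocol: str        # e.g. "AH" or "ESP"
    source_address: str  # address of the remote peer
    tunnel_name: str


class ShadowSACache:
    """Shadow SA cache organized by DVIPA, with one index per direction."""

    def __init__(self):
        self._inbound = {}   # dvipa -> {(spi, protocol, source): SA}
        self._outbound = {}  # dvipa -> {tunnel_name: SA}

    def install(self, sa):
        """Store an SA received from the routing stack under both indexes."""
        inbound_key = (sa.spi, sa.protocol, sa.source_address)
        self._inbound.setdefault(sa.dvipa, {})[inbound_key] = sa
        self._outbound.setdefault(sa.dvipa, {})[sa.tunnel_name] = sa

    def lookup_inbound(self, dvipa, spi, protocol, source_address):
        """Inbound access: SPI, IPSec protocol and source address."""
        return self._inbound.get(dvipa, {}).get((spi, protocol, source_address))

    def lookup_outbound(self, dvipa, tunnel_name):
        """Outbound access: Tunnel Name found by filter rule lookup."""
        return self._outbound.get(dvipa, {}).get(tunnel_name)
```

Keeping a separate index per direction means neither lookup
has to scan the SAs of a DVIPA; whether an actual stack
organizes its cache this way is not stated in the text.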

As seen in Figure 4, not all communication protocol
stacks in a Sysplex need incorporate both the SA cache 23
and the shadow SA cache 23'. For example, if a
communication protocol stack does not support IPSec, the
communication protocol stack would not require either
cache. Similarly, if the communication protocol stack
does not support IPSec for distributed communications as
described herein, the communication protocol stack would
not require the shadow cache 23'.

As is further seen in Figure 4, the communication
protocol stacks 22, 26, 30, 34 and 38 may communicate
with each other through a coupling facility 40 of the
Sysplex 10, for example, utilizing XCF messaging.
Furthermore, the communication protocol stacks 22 and 38
may communicate with an external network 44 such as the
Internet, an intranet, a Local Area Network (LAN) or Wide
Area Network (WAN) utilizing the Enterprise System
Connectivity (ESCON) 42. Thus, a client 46 may utilize
the network 44 to communicate with an application
executing on an MVS image in Sysplex 10 through the
communication protocol stacks 22 and 38 which may
function as routing protocol stacks as described herein.

As is further illustrated in Figure 4, as an example
of utilization of the present invention and for
illustration purposes, communication protocol stack 22
is associated with MVS image MVS 1, which has
application APP A executing on MVS image MVS 1 and
utilizing communication protocol stack 22 to allow access
to, for example, client 46 through network 44.
Furthermore, the communication protocol stack 22 is
capable of IPSec processing, managing and accessing the
SPD and SAD. MVS image MVS 1 also has an instance of the
IKE application executing to allow negotiation of IPSec
SAs.

Similarly, communication protocol stack 26 is
associated with MVS image MVS 2 which has a second
instance of application APP A and an instance of
application APP B executing on MVS image MVS 2 which may
utilize communication protocol stack 26 for
communications. Communication protocol stack 30 which is
associated with MVS image MVS 3 has a second instance of
application APP B executing on MVS image MVS 3 which may
utilize communication protocol stack 30 for
communications. Communication protocol stack 34 which is
associated with MVS image MVS 4 has a third instance of
application APP A executing on MVS image MVS 4 which may
utilize communication protocol stack 34 for
communications. MVS image MVS 4 may also have an instance
of the IKE application executing to allow negotiation of
IPSec SAs.

Finally, communication protocol stack 38 which is
associated with MVS image MVS 5 has a third instance of
application APP B executing on MVS image MVS 5 which may
utilize communication protocol stack 38 for
communications. Furthermore, the communication protocol
stack 38 is capable of IPSec processing, managing and
accessing the SPD and SAD. MVS image MVS 5 also has an
instance of the IKE application executing to allow
negotiation of IPSec SAs.

VIPA Distributor allows for protocol stacks which
are defined as supporting DVIPAs to share the DVIPA and
communicate with network 44 through a routing protocol
stack such that all protocol stacks having a server
application which is associated with the DVIPA will
appear to the network 44 as a single IP address. Such
dynamically routable VIPAs may be provided by designating
a protocol stack, such as protocol stack 22, as a routing
protocol stack. Other protocol stacks are notified of the
routing protocol stack. The other protocol stacks then
notify the routing protocol stack when an application
which binds to the DVIPA is started. Such routing
protocol stacks may also provide IPSec processing for the
DVIPA as described in commonly assigned and concurrently
filed United States Patent Application Serial No. ,
entitled "METHODS, SYSTEMS AND COMPUTER PROGRAM PRODUCTS
FOR PROVIDING FAILURE RECOVERY OF NETWORK SECURE
COMMUNICATIONS IN A CLUSTER COMPUTING ENVIRONMENT"
(Attorney Docket No. 5577-221), United States Patent
Application Serial No. , entitled "METHODS, SYSTEMS
AND COMPUTER PROGRAM PRODUCTS FOR PROVIDING DATA FROM
NETWORK SECURE COMMUNICATIONS IN A CLUSTER COMPUTING
ENVIRONMENT" (Attorney Docket No. 5577-220) and United
States Patent Application Serial No. , entitled
"METHODS, SYSTEMS AND COMPUTER PROGRAM PRODUCTS FOR
TRANSFERRING SECURITY PROCESSING BETWEEN PROCESSORS IN A
CLUSTER COMPUTING ENVIRONMENT" (Attorney Docket No.
5577-216), the disclosures of which are incorporated
herein by reference as if set forth fully herein.

The communication protocol stacks 22, 26, 30, 34 and
38 may be configured as to which stacks are routing
stacks, backup routing stacks and server stacks.
Different DVIPAs may have different sets of backup
stacks, possibly overlapping. The definition of backup
stacks may be the same as that for the VIPA takeover
function described in United States Patent Application
Serial No. 09/401,419, entitled "METHODS, SYSTEMS AND
COMPUTER PROGRAM PRODUCTS FOR AUTOMATED MOVEMENT OF IP
ADDRESSES WITHIN A CLUSTER" which is incorporated herein
by reference as if set forth fully herein.

Configuration of a dynamic routable VIPA may be
provided by a definition block established by a system
administrator for each routing communication protocol
stack 22 and 38. Such a definition block is described in
the above referenced United States Patent Applications
and defines dynamic routable VIPAs for which a
communication protocol stack operates as the primary
communication protocol stack. Backup protocol stacks may
be defined as described in the above referenced
applications with reference to the VIPA takeover
procedure. Thus, the definition block "VIPADynamic" may
be used to define dynamic routable VIPAs. Within the
VIPADynamic block, a definition may also be provided for
a protocol stack supporting moveable VIPAs. All of the
VIPAs in a single VIPADEFine statement may belong to the
same subnet, network, or supernet, as determined by the
network class and address mask. VIPAs may also be defined
as moveable VIPAs which may be transferred from one
communication protocol stack to another.

Similarly, within the definitions, a protocol stack
may be defined as a backup protocol stack and a rank
(e.g. a number between 1 and 254) provided to determine
relative order within the backup chain(s) for the
associated dynamic routable VIPA(s). A communication
protocol stack with a higher rank will take over the
dynamic VIPAs before a communication protocol stack with
a lower rank.
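
The rank ordering described above may be illustrated with
a short sketch. The function name takeover_order and the
stack names are hypothetical, and the range check treats
the "1 to 254" example from the text as a firm limit, which
is an assumption.

```python
def takeover_order(backup_ranks):
    """Order backup stacks from first to last to take over a DVIPA.

    backup_ranks maps a (hypothetical) stack name to its configured
    rank; a higher rank takes over before a lower rank.
    """
    for name, rank in backup_ranks.items():
        # Assumption: ranks outside the 1..254 example range are rejected.
        if not 1 <= rank <= 254:
            raise ValueError(f"rank for {name} out of range: {rank}")
    return sorted(backup_ranks, key=lambda name: backup_ranks[name], reverse=True)
```

For example, ranks of 10, 200 and 50 for three backup
stacks would place the stack ranked 200 first in the
backup chain.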

Within the VIPADYNamic block, a VIPA may be defined
as a dynamic routable VIPA based on a VIPA address and a
portlist which is a list of ports for which the DVIPA
will apply. Alternatively, all ports for an IP address
may be considered as DVIPAs. Also provided in the
definition is a list of protocol stacks which will be
included as server stacks in routing communications
directed to the DVIPA. The IP addresses which define the
potential server stacks may be XCF addresses of the
protocol stacks or may be designated "ALL." If "ALL" is
designated, then all stacks in the Sysplex are candidates
for distribution. In addition to the above definitions, a
range of IP addresses may be defined as DVIPAs utilizing
the VIPARange definition.

Returning to the example of Figure 4 and utilizing
the definitions described in the above referenced
applications, for MVS1 to MVS5, the VIPADEFine statements
may be:

MVS1: VIPADEFine MOVEable IMMEDiate DVA1
      VIPADISTribute DVA1 PORT 60 DESTIP XCF1, XCF2, XCF4

MVS5: VIPADEFine MOVEable IMMEDiate DVB1
      VIPADISTribute DVB1 PORT 60 DESTIP ALL
      VIPADISTribute DVA1 PORT 60 DESTIP XCF2, XCF3, XCF4

For purposes of illustration, the respective address
masks have been omitted because they are, typically, only
significant to the routing daemons. In the above
illustration, XCF1 is an XCF address of the TCP/IP stack
on MVS1, XCF2 is an XCF address of the TCP/IP stack on
MVS2 and XCF3 is an XCF address of the TCP/IP stack on
MVS4. Note that, for purposes of the present example,
definitions for MVS2, MVS3, and MVS4 are not specified.
Such may be the case where the protocol stacks for these
MVS images are candidate target protocol stacks and are
not identified as routing protocol stacks and, therefore,
receive their dynamic routable VIPA definitions from the
routing protocol stacks. Additional VIPA definitions may
also be provided; however, in the interests of clarity,
such definitions have been omitted.
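
As a rough illustration of how the example VIPADISTribute
statements determine candidate target stacks, the sketch
below holds the three statements in a table and expands
"ALL" to every stack in the Sysplex. The table layout, the
function name candidate_targets and the address XCF5 used
in the usage example are assumptions made for
illustration; this is not a parser for the actual
configuration syntax.

```python
ALL = "ALL"

# (routing stack, DVIPA, port) -> DESTIP value, mirroring the example
# statements above. The keying scheme is an assumption of this sketch.
DISTRIBUTION = {
    ("MVS1", "DVA1", 60): ["XCF1", "XCF2", "XCF4"],
    ("MVS5", "DVB1", 60): ALL,
    ("MVS5", "DVA1", 60): ["XCF2", "XCF3", "XCF4"],
}


def candidate_targets(routing_stack, dvipa, port, sysplex_stacks):
    """Return the XCF addresses that are candidates for distribution."""
    destip = DISTRIBUTION.get((routing_stack, dvipa, port))
    if destip is None:
        return []                    # no distribution defined for this port
    if destip == ALL:
        return list(sysplex_stacks)  # every stack in the Sysplex is a candidate
    return list(destip)
```

With sysplex_stacks given as XCF1 through XCF5, the DVB1
entry yields all five addresses, while the DVA1 entry for
MVS1 yields only XCF1, XCF2 and XCF4.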

Utilizing the above described system configuration
as an example, distributed IPSec processing will now be
described with reference to Figures 5 through 11. Figure
5 illustrates operations according to embodiments of the
present invention. As seen in Figure 5, an IKE
associated with a routing communication protocol stack
negotiates an SA for the DVIPA routed by the protocol
stack (block 100). This SA information is then
distributed to potential target hosts of the DVIPA over a
trusted communication link (block 102). IPSec processing
for the DVIPA is performed using the distributed
information at the target hosts (block 104).

Operations of a routing communication protocol stack
according to embodiments of the present invention when an
IPSec SA is negotiated, such as the protocol stacks 22
and 38 in Figure 4 in the present example, are
illustrated in Figure 6. As seen in Figure 6, when an
IPSec SA is negotiated, IKE installs the SA in the
corresponding SA caches 23 and 27 of the corresponding
routing communication protocol stacks 22 and 38 (block
112).

If the installed SA information is not for a DVIPA
(block 114), operations proceed normally. However, if
the installed SA information is for a DVIPA (block 114),
the Phase 2 sequence number of the installed SA is stored
in the coupling facility (block 116) and the SA and
dynamic filter information is sent to the target hosts
for the DVIPA (block 118). Such a distribution may be
carried out by, for example, broadcasting the information
utilizing XCF communications.

In the present example, when an SA associated with
DVA1 is installed in the SA cache 23 of the routing
communication protocol stack 22, the SA information would
be distributed to the communication protocol stacks 26
and 34. When an SA associated with DVB1 is installed in
the SA cache 27 of the routing communication protocol
stack 38, the SA information would be distributed to the
communication protocol stacks 26 and 30.

As seen in Figure 7, the communication protocol
stacks receive the SA information associated with a DVIPA
(block 120) and install the SA information in their
corresponding shadow SA cache 23' and 27' and the dynamic
filter rules in the policy filter (block 122). The
shadow SA caches 23' and 27' are used as described herein
for storing SA information which is not "owned" by an IKE
associated with a communication protocol stack. Thus,
for example, the SA information received by the
communication protocol stacks of target hosts is placed
into the shadow SA cache 23' or 27' corresponding to the
DVIPA associated with the SA information because the SA
information was not negotiated by an IKE function
associated with the target host but was negotiated by an
IKE function associated with the routing communication
protocol stack for the DVIPA.
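
The operations of Figures 6 and 7 may be sketched together
as follows. The classes RoutingStack and TargetStack, the
use of a plain dictionary in place of the coupling
facility, and the direct method call in place of an XCF
broadcast are all simplifying assumptions made for
illustration.

```python
class TargetStack:
    """Target host stack holding SAs it does not own (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.shadow_sa_cache = {}  # DVIPA -> list of SA identifiers
        self.policy_filter = []    # dynamic filter rules received with SAs

    def receive_sa(self, dvipa, sa_id, filter_rule):
        # Blocks 120 and 122: install into the shadow cache and policy filter.
        self.shadow_sa_cache.setdefault(dvipa, []).append(sa_id)
        self.policy_filter.append(filter_rule)


class RoutingStack:
    """Routing stack whose IKE owns the negotiated SAs (hypothetical model)."""

    def __init__(self, targets_by_dvipa, coupling_facility):
        self.sa_cache = []                          # SAs owned by the local IKE
        self.targets_by_dvipa = targets_by_dvipa    # DVIPA -> list of TargetStack
        self.coupling_facility = coupling_facility  # dict: SA id -> sequence number

    def install_sa(self, address, sa_id, filter_rule, sequence_number):
        self.sa_cache.append(sa_id)                      # block 112
        targets = self.targets_by_dvipa.get(address)
        if targets is None:                              # block 114: not a DVIPA
            return []
        self.coupling_facility[sa_id] = sequence_number  # block 116
        for target in targets:                           # block 118
            target.receive_sa(address, sa_id, filter_rule)
        return [target.name for target in targets]
```

In terms of the present example, a routing stack built
with DVA1 mapped to the stacks of MVS 2 and MVS 4 would
place an SA for DVA1 in its own cache and in the shadow
caches of both targets, while an SA for a non-distributed
address would remain in the routing stack's cache only.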

Figure 8 illustrates operations for routing inbound
communications with distributed IPSec processing as
described herein. As seen in Figure 8, the routing
communication protocol stack receives a datagram which
requires IPSec processing (block 130). The routing
communication protocol stack peeks at the IPSec
information (block 132) to determine if the datagram is
for an existing distributed connection (block 134). Such
a determination may be made by processing (e.g.
decrypting and/or parsing) the datagram to the end of the
TCP header and evaluating the information contained
therein to determine the source and destination IP
address and port. In particular, the peek function may
locate an SA using SPI, Protocol and source address. If
encrypted, decryption of the datagram is performed to the
end of the TCP header. The SA mode may be used to
determine the offset to which to decrypt or, in IPv6, the
decryption may be iterative.

The source and destination addresses and the port
may be used to determine if an entry exists in the
routing communication protocol stack's distributed
connection table (DCT). If such a corresponding entry
exists, then the datagram may be considered to be for an
existing distributed connection. If the routing
communication protocol stack cannot determine if the
datagram is for a distributed connection without further
IPSec processing, then IPSec processing may be performed
by the routing communication protocol stack as described
in the commonly assigned and concurrently filed United
States Patent Applications identified above.

If the datagram is for an existing distributed
connection (block 134), a tunnel check is performed to
verify that the datagram was received from the proper
tunnel if the routing communication protocol stack is
performing inbound filtering (block 136). If the routing
communication protocol stack is not performing inbound
filtering, the tunnel check may be performed by the
target communication protocol stack. A replay sequence
check is also performed (block 138). If the tunnel
check, if performed, and the replay sequence check are
valid, the datagram is forwarded to the target host
(block 140). If a tunnel check or replay sequence check
fails, the datagram may be discarded.
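
The text does not specify how the replay sequence check of
block 138 is carried out. As one possibility, the sketch
below substitutes a generic sliding-window check of the
kind commonly used for IPSec anti-replay protection. The
class name ReplayWindow and the window size of 64 are
assumptions, and in a complete implementation the window
would ordinarily be advanced only after the datagram has
been authenticated.

```python
class ReplayWindow:
    """Generic sliding-window replay check (not taken from the source text)."""

    def __init__(self, size=64):
        self.size = size
        self.highest = 0  # highest sequence number accepted so far
        self.bitmap = 0   # bit i set => sequence (highest - i) was accepted

    def check_and_update(self, seq):
        """Return True if seq is new and inside the window, recording it."""
        if seq <= 0:
            return False
        if seq > self.highest:
            # Slide the window forward and mark the new highest number.
            shift = seq - self.highest
            mask = (1 << self.size) - 1
            self.bitmap = ((self.bitmap << shift) | 1) & mask
            self.highest = seq
            return True
        offset = self.highest - seq
        if offset >= self.size:
            return False  # too old: fell off the left edge of the window
        if self.bitmap & (1 << offset):
            return False  # duplicate: already accepted
        self.bitmap |= 1 << offset
        return True
```

A datagram for which check_and_update returns False would
correspond to a failed replay sequence check and may be
discarded rather than forwarded to the target host.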

Returning to block 134, if the datagram is not for
an existing distributed connection, the routing
communication protocol stack determines if the datagram
is a SYN message to establish a connection (block 142).
If not, the datagram may be processed normally (block
146). If the datagram is a SYN message (block 142), the
routing communication protocol stack determines if the
connection is to be distributed to a target host (block
144). Such a determination may be made as described with
reference to the VIPA Distributor described in the above
referenced United States Patent Applications. For
example, the IPSec combinations which may be distributed
(referred to as "Crypto Dist" in the table) may include,
but are not limited to, the following:

IPSec combination       Mode       Sysplex  Crypto      Peek Req'd   Payload         Local IP @  Local IP @
                                   Dist?    Dist?       for inbound  Type            in Outer    in Inner
                                                                                     Header      Header

AH                      Transport  Y        Y           N            TCP             DVIPA       N/A
(IP,AH,Payload)                    N        N           N            ¬ESP            DVIPA       N/A

AH                      Tunnel     Y        Y           N            IP->TCP         DVIPA       DVIPA
(IP,AH,IP,Payload)                 N        N           N            IP->TCP         DVIPA       ¬DVIPA
                                   Y        N (note 2)  N            IP->TCP         ¬DVIPA      DVIPA
                                   N        N           N            IP->¬TCP, ¬ESP  DVIPA       ¬DVIPA

ESP                     Transport  Y        Y           Y (note 1)   TCP             DVIPA       N/A
(IP,ESP,Payload)                   N        N           Y            ¬TCP            DVIPA       N/A

ESP                     Tunnel     Y        Y           Y (note 1)   IP->TCP         DVIPA       DVIPA
(IP,ESP,IP,Payload)                N        N           Y            IP->TCP         DVIPA       ¬DVIPA
                                   Y        N (note 2)  Y            IP->TCP         ¬DVIPA      DVIPA
                                   N        N           Y            IP->¬TCP        DVIPA       ¬DVIPA

Combined AH,ESP         Transport  Y        Y           N (note 1)   TCP             DVIPA       N/A
(IP,AH,ESP,Payload)                N        N           Y            ¬TCP            DVIPA       N/A

Combined AH,ESP         Tunnel     Y        Y           Y (note 1)   IP->TCP         DVIPA       DVIPA
(IP,AH,ESP,IP,Payload)             N        N           Y            IP->TCP         DVIPA       ¬DVIPA
                                   Y        N (note 2)  Y            IP->TCP         ¬DVIPA      DVIPA
                                   N        N           Y            IP->¬TCP        DVIPA       ¬DVIPA

Note 1: If ESP null, payload is not encrypted and
therefore peek is not needed.

Note 2: This combination, while Sysplex distributable, is
not expected. Support for this case may involve "peeking"
into non-DVIPA headers to check for distributability.

Also indicated in the above table is whether the datagram
will be "peeked" into to determine if it is distributed.

If the SYN message is not for a distributed
connection (block 144), normal processing of the SYN is
performed (block 146). If the SYN message is for a
distributed connection (block 144), then the tunnel check
(if inbound filtering is performed) and replay sequence
check are performed (blocks 136 and 138) and, if valid,
the datagram is forwarded to the selected target host
(block 140).
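
The decision flow of blocks 134 through 146 may be
summarized in a short sketch. The Datagram fields, the
four-part connection key, the select_target callback and
the choice to record a new connection in the DCT only
once the datagram is forwarded are assumptions made for
illustration; the outcomes of the tunnel check and replay
sequence check are supplied as flags rather than computed.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

# (source addr, source port, destination addr, destination port)
ConnKey = Tuple[str, int, str, int]


@dataclass
class Datagram:
    key: ConnKey
    is_syn: bool
    tunnel_ok: bool  # outcome of the tunnel check (block 136)
    replay_ok: bool  # outcome of the replay sequence check (block 138)


def route_inbound(
    dgram: Datagram,
    dct: Dict[ConnKey, str],
    select_target: Callable[[ConnKey], Optional[str]],
    inbound_filtering: bool = True,
) -> Tuple[str, Optional[str]]:
    """Return ("forward", target), ("discard", None) or ("normal", None)."""
    target = dct.get(dgram.key)                    # block 134
    is_new = target is None
    if is_new:
        if not dgram.is_syn:                       # block 142
            return ("normal", None)                # block 146
        target = select_target(dgram.key)          # block 144
        if target is None:
            return ("normal", None)                # block 146
    if inbound_filtering and not dgram.tunnel_ok:  # block 136
        return ("discard", None)
    if not dgram.replay_ok:                        # block 138
        return ("discard", None)
    if is_new:
        dct[dgram.key] = target                    # assumption: record on forward
    return ("forward", target)                     # block 140
```

When inbound_filtering is False the tunnel check is skipped
at the routing stack, reflecting the case in which the
target communication protocol stack performs that check
instead.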

Optionally, the routing communication protocol
stacks may also inbound filter the received datagrams.
In such a case, inbound filtering may be bypassed for
such datagrams at the target hosts. One mechanism to
achieve such a bypass of inbound filtering is for the
routing communication protocol stack to inbound filter
the received datagrams and then encapsulate the received
datagrams into a generic routing encapsulation (GRE)
format before forwarding the datagrams to the target
hosts. The source and destination address in the outer
GRE header may be the IP addresses for the XCF links.
Furthermore, the physical link may be matched with the IP
addresses of the XCF links to assure that the GRE
encapsulated datagram was received from the corresponding
physical link. Such a verification may reduce the
likelihood of "spoofing." In such a case, the target
hosts may bypass inbound filtering for such GRE
encapsulated datagrams and decapsulate the encapsulated
datagrams for processing.
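
The link verification described above may be sketched as a
single predicate evaluated by the target host before it
bypasses inbound filtering. The function name, its
parameters, the link identifiers and the example addresses
are hypothetical; the sketch only compares addresses and
link identifiers and does not parse a GRE header.

```python
def may_bypass_inbound_filtering(outer_src, outer_dst, arrival_link,
                                 xcf_links, local_xcf_addr):
    """Decide whether a GRE encapsulated datagram may skip inbound filtering.

    xcf_links maps the XCF IP address of each routing stack to the
    identifier of the physical link on which its traffic should arrive.
    """
    if outer_dst != local_xcf_addr:
        return False                 # not addressed to this stack's XCF link
    expected_link = xcf_links.get(outer_src)
    # Accept only if the outer source is a known XCF address and the datagram
    # arrived on the physical link that corresponds to that address.
    return expected_link is not None and expected_link == arrival_link
```

A datagram that claims an XCF source address but arrives
on a different physical link fails the check and would be
subject to normal inbound filtering or discarded, which is
the "spoofing" case the verification is meant to address.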

Figure 9 illustrates operations for processing of
inbound datagrams by the target hosts utilizing
distributed IPSec processing. As seen in Figure 9, IPSec
processing is performed using SA information from the
shadow SA cache 23' or 27' (block 150) and the datagram
is inbound filtered (block 152). The inbound lifesize
count is also updated to reflect the received and
processed datagram (block 154).

To save on communications with the coupling
facility, the inbound lifesize count in the IKE of the
routing communication protocol stack, which is used by
the IKE of the routing communication protocol stack to
determine when to refresh the SA, may, optionally, be
only periodically updated on a "batch" basis. Such a
period may be uniform or non-uniform and may be based on
time, the number of processed communications, an amount
of processed data or the like. Thus, for example, it may
be determined if sufficient datagrams have been received
to update the inbound lifesize count in the routing
communication protocol stack (block 156). If so, then
the inbound lifesize count in the IKE of the routing
communication protocol stack may be updated by sending an
XCF message to the routing communication protocol stack
(block 158). Such a batch update of the lifesize count
may be based on a percentage of the total lifesize. For
example, an update may occur when the lifesize count in
the target host reaches 5% of the total lifesize count.
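
The batch update of blocks 154 through 158 may be sketched
as follows. The class name LifesizeReporter and the send
callback, which stands in for the XCF message to the
routing communication protocol stack, are hypothetical;
the default of 5 percent follows the example in the text.

```python
class LifesizeReporter:
    """Accumulates lifesize at a target host and reports it in batches."""

    def __init__(self, total_lifesize, percent=5, send=print):
        # Report once the unreported count reaches this share of the total.
        self.threshold = total_lifesize * percent // 100
        self.pending = 0   # lifesize processed but not yet reported
        self.send = send   # stand-in for the XCF message to the routing stack

    def record(self, amount):
        self.pending += amount              # block 154
        if self.pending >= self.threshold:  # block 156
            self.send(self.pending)         # block 158
            self.pending = 0
```

With a total lifesize of 1000 units, two datagrams of 30
units each would produce a single report of 60 units after
the second datagram rather than one report per datagram.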

Alternatively, the threshold for updating the IKE of
the routing communication protocol stack may be
dynamically determined. For example, the threshold may
be established at an initially high level and then
decreased if the IKE was not successful in refreshing the
SA before its expiration. Similarly, the threshold could
be increased until the IKE was not successful or a
maximum threshold was reached.
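
One possible reading of the dynamic threshold is a simple
step adjustment, sketched below with the threshold
expressed as a whole-number percentage of the total
lifesize. The function name, the step size and the minimum
and maximum bounds are assumptions; the text does not
prescribe particular values.

```python
def adjust_threshold(threshold_pct, refresh_succeeded,
                     step=1, minimum=1, maximum=25):
    """Return the next reporting threshold, in percent of total lifesize.

    The threshold is raised after a timely SA refresh, up to a maximum,
    and lowered after a refresh that did not complete before expiration.
    """
    if refresh_succeeded:
        return min(maximum, threshold_pct + step)
    return max(minimum, threshold_pct - step)
```

A lower threshold causes the target host to report more
often, giving the IKE of the routing communication
protocol stack earlier notice that an SA is approaching
its lifesize.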

Furthermore, if requested by a policy filter rule on
the target host, logging may be performed. In particular
embodiments of the present invention, the policy filter
rule need not be looked up but the binding information
associated with the TCP connection, for example, in the
Transmission Control Block (TCB) (which has logging
information from the policy filter), may be consulted to
determine whether the datagram should be logged. This
function may be performed at the TCP layer.

Figure 10 illustrates operations of a communication
protocol stack at a target host for processing outbound
datagrams. The communication protocol stack determines
if the datagram is for a connection utilizing distributed
IPSec processing (block 160). Such a determination may
be made by detecting that the dynamic filter rule is
associated with an SA in the shadow SA cache 23' or 27'.
If the datagram is not for a connection utilizing
distributed IPSec processing (block 160), the datagram
may be processed normally. Otherwise, the communication
protocol stack obtains the IPSec sequence number from the
coupling facility (block 162).

The coupling facility provides an updated sequence
number. If possible, the communication protocol stack
may request multiple sequence numbers, for example, if
multiple datagrams are ready to be processed in sequence
for an SA. In particular, the IXLIST w/LISTKEYINC call
may be used to obtain the sequence number. This call may
provide the necessary serialization and may not require
IPSec to know the current sequence number in the coupling
facility. Thus, by requesting the sequence number, the
sequence number may be automatically updated for
subsequent datagrams by the current or other
communication protocol stacks.

With the obtained sequence number, IPSec processing
of the datagram or datagrams may be performed using the
SA information from the shadow SA cache 23' and the
processed datagrams may be sent to the destination (block
164). Note that the datagrams may be sent directly to
the destination without going through the routing
communication protocol stack.
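
The serialized allocation of sequence numbers may be
illustrated with an in-process stand-in for the coupling
facility entry. The class SequenceCounter and its reserve
method are hypothetical; a lock is used here only to model
the serialization that the coupling facility call is said
to provide, and the sketch does not reproduce that call.

```python
import threading


class SequenceCounter:
    """Stand-in for a shared entry holding the last issued sequence number."""

    def __init__(self, start=0):
        self._value = start
        self._lock = threading.Lock()

    def reserve(self, count=1):
        """Atomically reserve count consecutive numbers and return them."""
        if count < 1:
            raise ValueError("count must be at least 1")
        with self._lock:
            first = self._value + 1
            self._value += count  # later callers continue after this block
        return range(first, first + count)
```

A stack with three datagrams ready for the same SA could
call reserve(3) once and receive three consecutive
numbers, while another stack calling concurrently would
receive numbers that do not overlap with them. The caller
never needs to know the current value beforehand.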

As with the inbound lifesize count, the outbound 
lifesize count of the IKE of the routing communication 
protocol stack should also be updated so that the SA may 
be refreshed in a timely manner. Thus, for example, it 
may be determined if sufficient datagrams have been 
received to update the outbound lifesize count in the 
primary (block 166). If so, then the outbound lifesize
count in the IKE of the routing communication protocol
stack may be updated by sending an XCF message to the
routing communication protocol stack (block 168). Such a
periodic or batch update of the outbound lifesize may
have a uniform or non-uniform period and may be based on
time, the number of processed communications, an amount
of processed data or the like. Furthermore, the batch
update may be based on a percentage of the total outbound
lifesize; for example, an update may occur when the
outbound lifesize count in the target host reaches 5% of
the total outbound lifesize count.
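
A percentage-based batch update of the kind just described
may be sketched, purely as an example, as follows. The class
name LifesizeReporter, the byte-based accounting and the
send_update callback (standing in for the XCF message to the
routing communication protocol stack) are assumptions made
for the sketch.

```python
class LifesizeReporter:
    """Accumulates outbound usage locally and reports it in batches."""

    def __init__(self, total_lifesize, percent, send_update):
        # Batch size expressed as a percentage of the total lifesize,
        # e.g. percent=5 reports after every 5% of the total is consumed.
        self.batch_size = total_lifesize * percent // 100
        self.pending = 0                # usage not yet reported
        self.send_update = send_update  # hypothetical stand-in for the XCF message

    def record(self, nbytes):
        """Count processed data; report once the batch size is reached."""
        self.pending += nbytes
        if self.pending >= self.batch_size:
            self.send_update(self.pending)
            self.pending = 0


sent = []                                   # captures the updates that would be sent
reporter = LifesizeReporter(total_lifesize=1000, percent=5, send_update=sent.append)
reporter.record(30)   # below the 50-byte batch size, nothing sent
reporter.record(30)   # 60 bytes pending, one update of 60 is sent
reporter.record(10)   # pending restarts at 10
```

The same structure would apply if the batch were instead
based on elapsed time or on a count of processed
communications; only the condition tested in record would
change.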

Alternatively, the threshold for updating the IKE of 
the routing communication protocol stack may be 
dynamically determined. For example, the threshold may 
be established at an initially high level and then 
decreased if the IKE was not successful in refreshing the 
SA before the SA expires. Similarly, the threshold could
be increased until the IKE was not successful or a
maximum threshold was reached.
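
One possible form of such a dynamic adjustment is sketched
below. The function name adjust_threshold, the fixed step
size and the explicit minimum bound are assumptions
introduced for the example and are not taken from the
embodiments described above.

```python
def adjust_threshold(current, refreshed_in_time, step, minimum, maximum):
    """Return the next update threshold.

    If the IKE refreshed the SA before it expired, the threshold is
    raised (fewer updates) up to `maximum`. If the refresh came too
    late, the threshold is lowered (more frequent updates) down to
    `minimum`.
    """
    if refreshed_in_time:
        return min(current + step, maximum)
    return max(current - step, minimum)


lowered = adjust_threshold(10, refreshed_in_time=False, step=2, minimum=1, maximum=20)
raised = adjust_threshold(19, refreshed_in_time=True, step=2, minimum=1, maximum=20)
```

Here a failed refresh lowers a threshold of 10 to 8, and a
successful refresh raises a threshold of 19 only to the
maximum of 20.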

Figure 11 illustrates the operations of the target
host communication protocol stack for initiation of a
connection utilizing distributed IPSec according to
embodiments of the present invention. When a new
connection is requested, the communication protocol stack
determines if the connection is for a DVIPA (block 170).
Such a determination may be made based on a VIPAList
distributed by the routing communication protocol stack.
If the connection is not for a DVIPA, then operations
continue in a conventional manner.

If the connection is for a DVIPA (block 170), it is
determined if an SA exists for the connection (block
172). Such a determination may be made based on the
contents of the shadow SA cache 23' or 27'. If an SA
already exists (block 172), then operations may proceed
as described in Figure 10. If an SA does not exist
(block 172), the communication protocol stack sends a
NEWCONN message to the routing communication protocol
stack to request that the IKE of the routing
communication protocol stack negotiate an SA for the
connection (block 174). When the SA is negotiated, it is
distributed to the communication protocol stack of the
target host, which receives the SA information and
installs it in its shadow SA cache 23' or 27' (block
176). The communication protocol stack of the target
host uses the received SA information to IPSec process
the SYN message and send it to its destination (block
178).
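
The decision flow of blocks 170 through 178 may be sketched,
for purposes of illustration only, as follows. The function
name initiate_connection, the representation of a connection
as a (local address, remote address) tuple, the use of a
dictionary for the shadow SA cache and the request_sa
callback (standing in for the NEWCONN exchange with the
routing communication protocol stack) are all assumptions of
the sketch.

```python
def initiate_connection(conn, vipa_list, shadow_sa_cache, request_sa):
    """Decide how a new outbound connection is handled on a target host.

    conn is a hypothetical (local_address, remote_address) tuple.
    Returns a (path, sa) pair, where path is "conventional" or "ipsec".
    """
    local_address, _remote = conn
    if local_address not in vipa_list:   # block 170: not a DVIPA
        return "conventional", None
    sa = shadow_sa_cache.get(conn)       # block 172: is an SA already cached?
    if sa is None:
        sa = request_sa(conn)            # block 174: ask the routing stack's IKE
        shadow_sa_cache[conn] = sa       # block 176: install the distributed SA
    return "ipsec", sa                   # block 178: IPSec process and send


requests = []                            # records each simulated NEWCONN request


def fake_request_sa(conn):
    """Test double that pretends the routing stack negotiated an SA."""
    requests.append(conn)
    return {"spi": len(requests)}


cache = {}
vipas = {"10.1.1.1"}
first = initiate_connection(("10.1.1.1", "192.0.2.7"), vipas, cache, fake_request_sa)
second = initiate_connection(("10.1.1.1", "192.0.2.7"), vipas, cache, fake_request_sa)
other = initiate_connection(("10.9.9.9", "192.0.2.7"), vipas, cache, fake_request_sa)
```

In this example only the first connection triggers a request
to the routing stack; the second reuses the cached SA, and
the connection from a non-DVIPA address follows the
conventional path.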

In the embodiments described above, error logging
may be performed by the system detecting the error.
Furthermore, when a distributed SA is terminated, it may
be removed from the shadow SA cache 23' or 27' by
direction of the routing communication protocol stack
associated with the IKE which negotiated the SA. Thus,
for example, a delete SA instruction to the IKE would
result in the SA associated with the instruction being
removed from the shadow SA cache 23' or 27' of the target
hosts.

In the event of failure and takeover by a backup
routing communication protocol stack or movement of
ownership of a dynamic VIPA to a new routing
communication protocol stack, the shadow SA cache 23' or
27' for the dynamic VIPA may be removed and rebuilt on
the target hosts with the newly negotiated SAs by the
new routing communication protocol stack.
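
The removal of a terminated SA and the rebuilding of the
cache after takeover may be sketched as follows. The class
name ShadowSACache, its method names and the keying of
entries by dynamic VIPA and an SA identifier are
hypothetical choices made for the example.

```python
class ShadowSACache:
    """Hypothetical per-host cache of SAs distributed by a routing stack."""

    def __init__(self):
        self._by_dvipa = {}  # dynamic VIPA -> {sa_id: sa_info}

    def install(self, dvipa, sa_id, sa_info):
        """Add an SA received from the routing stack."""
        self._by_dvipa.setdefault(dvipa, {})[sa_id] = sa_info

    def delete(self, dvipa, sa_id):
        """Remove one SA at the direction of the routing stack."""
        self._by_dvipa.get(dvipa, {}).pop(sa_id, None)

    def rebuild(self, dvipa, new_sas):
        """Discard every SA for a dynamic VIPA and install the new set."""
        self._by_dvipa[dvipa] = dict(new_sas)

    def lookup(self, dvipa, sa_id):
        """Return the cached SA information, or None if absent."""
        return self._by_dvipa.get(dvipa, {}).get(sa_id)


cache = ShadowSACache()
cache.install("10.1.1.1", "sa-1", {"owner": "primary"})
cache.install("10.1.1.1", "sa-2", {"owner": "primary"})
cache.delete("10.1.1.1", "sa-1")                          # delete SA instruction
cache.rebuild("10.1.1.1", {"sa-9": {"owner": "backup"}})  # takeover by a backup
```

After the rebuild, none of the SAs negotiated by the former
routing stack remain in the cache for that dynamic VIPA.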

As described herein, communications between the
routing communication protocol stacks and the target
communication protocol stacks are carried out over a
trusted communication link and, therefore, need not use
the network security. Such a trusted communication link
may be provided, for example, by the interconnections
between the data processing systems when co-located in a
physically secure environment, a logically secure
environment or both. For example, a physically secure
environment may be provided by a local area network in a
secure building. A logically secure environment may be
provided by, for example, the cross coupling facility
(XCF) in an OS/390 Sysplex such that the communications
between the routing communication protocol stacks and the
target communication protocol stacks are provided using
XCF links. As will be appreciated by those of skill in
the art, the XCF links may be logically secure or both
logically and physically secure. Similarly, encryption
could also be provided for communications between data
processing systems in the cluster such that the
communications could be transmitted over a non-secure
transmission medium. As will be appreciated by those of
skill in the art in light of the present disclosure,
other trusted mechanisms for communications within the
cluster of data processing systems may also be utilized.

While the present invention has been described with
respect to the VIPA distribution function as a part of
the communication protocol stack, as will be appreciated
by those of skill in the art, such functions may be
provided as separate functions, objects or applications
which may cooperate with the communication protocol
stacks. Furthermore, the present invention has been
described with reference to particular sequences of
operations. However, as will be appreciated by those of
skill in the art, other sequences may be utilized while
still benefitting from the teachings of the present
invention. Thus, while the present invention is
described with respect to a particular division of
functions or sequences of events, such divisions or
sequences are merely illustrative of particular
embodiments of the present invention and the present
invention should not be construed as limited to such
embodiments.

Furthermore, while the present invention has been
described with reference to particular embodiments of the
present invention in a System/390 environment, as will be
appreciated by those of skill in the art, the present
invention may be embodied in other environments and
should not be construed as limited to System/390 but may
be incorporated into other systems, such as a Unix or
other environments, by associating applications or groups
of applications with an address rather than a
communications adapter. Thus, the present invention may
be suitable for use in any collection of data processing
systems which allow sufficient communication to all of
the systems for the use of dynamic virtual addressing.
Accordingly, specific references to System/390 systems or
facilities, such as the "coupling facility," "ESCON,"
"Sysplex" or the like should not be construed as limiting
the present invention.

In the drawings and specification, there have been
disclosed typical preferred embodiments of the invention
and, although specific terms are employed, they are used
in a generic and descriptive sense only and not for
purposes of limitation, the scope of the invention being
set forth in the following claims.